Towards Hierarchical Prosodic Prominence Generation in TTS Synthesis
Abstract
We address the problem of identifying (from text) and generating pitch accents in HMM-based English TTS synthesis. We show, through a large-scale perceptual test, that a large improvement in the binary discrimination between pitch-accented and non-accented words has no effect on the quality of the speech generated by the system. On the other hand, adding a third accent type that emphatically marks words conveying "contrastive" focus (automatically identified from text) has a beneficial effect on the synthesized speech. These results support accounts of prosodic prominence that treat the prosodic patterns of utterances as hierarchically structured, and they point out the limits of flattening that structure into a simple accent/non-accent distinction.
Related resources
A Data-driven Adaptation of Prosody in a Multilingual TTS
Proper accentuation and phrasing make the syntactic and semantic structure of the message more transparent to the listener. Therefore a good modeling of prosody in a TTS system has to be structured into appropriate levels. The implemented prosodic hierarchy should guide the listeners’ attention and help in support of the comprehension process. Since prosody functions as a distractor, it is very...
Identifying prosodic prominence patterns for English text-to-speech synthesis
This thesis proposes to improve and enrich the expressiveness of English Text-to-Speech (TTS) synthesis by identifying and generating natural patterns of prosodic prominence. In most state-of-the-art TTS systems the prediction from text of prosodic prominence relations between words in an utterance relies on features that very loosely account for the combined effects of syntax, semantics, word i...
Designing prosodic databases for automatic modelling in 6 languages
We describe the design and creation of prosodic speech databases for 6 languages. The purpose of the databases is to allow derivation of prosody models in order to improve TTS synthesis. The main prosodic variables to model were word prominence, prosodic boundary strength and phone duration. We describe the database structure and contents and the methodology for creating prosodic databases, and...
An Environment for Word Prominence Classification in Slovenian Language
Besides phrasing, prominence is one of the most important parameters of speech prosody to model. So-called data-driven approaches now seem to be the appropriate solution for prosody modeling in current text-to-speech (TTS) systems. They allow prosodic regularities to be automatically extracted from a prosodic database of natural speech. In this paper we present an evaluation of suit...
Prominence detected by listeners for future speech synthesis application
The aim of the present investigation is to identify, and give a pilot statistical description of, the prominence distinguished by native speakers in read-aloud texts taken from the Russian corpus for text-to-speech unit-selection synthesis. The TTS system uses the linguistic information encoded in the input text. Therefore the parameters which are easily extracted from the text ...
Publication date: 2012